lsemantica: A command for text similarity based on latent semantic analysis
Similar resources
lsemantica: A Stata Command for Text Similarity based on Latent Semantic Analysis
The lsemantica command, presented in this paper, implements Latent Semantic Analysis in Stata. Latent Semantic Analysis is a machine learning algorithm for word and text similarity comparison. Latent Semantic Analysis uses Truncated Singular Value Decomposition to derive the hidden semantic relationships between words and texts. lsemantica provides a simple command for Latent Semantic Analysis ...
Query expansion based on relevance feedback and latent semantic analysis
Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users alike. Constructing an adequate query which represents the best specification of users’ information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...
A Similarity-based Probability Model for Latent Semantic Indexing
A dual probability model is constructed for Latent Semantic Indexing (LSI) using the cosine similarity measure. Both the document-document similarity matrix and the term-term similarity matrix naturally arise from the maximum likelihood estimation of the model parameters, and the optimal solutions are the latent semantic vectors of LSI. Dimensionality reduction is justified by the statist...
Using Latent Semantic Analysis to Estimate Similarity
In three studies we investigated whether LSA cosine values estimate human similarity ratings of word pairs. In study 1 we found that LSA can distinguish between highly similar and dissimilar matches to a target word, but that it does not reliably distinguish between highly similar and less similar matches. In study 2 we showed that, using an expanded item set, the correlation between LSA rating...
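The abstracts above describe Latent Semantic Analysis as a truncated singular value decomposition of a word-by-text count matrix, with cosine values used to compare the resulting vectors. As a rough illustration only, and not the implementation behind lsemantica, the idea can be sketched in plain Python. The toy corpus, every function name, and the use of simple power iteration in place of a library SVD routine are assumptions made for this sketch.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = norm(a) * norm(b)
    return dot(a, b) / denom if denom > 0 else 0.0

def doc_term_matrix(docs):
    """Build a documents-by-terms count matrix from whitespace-split texts."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0.0] * len(vocab)
        for w in d.split():
            row[index[w]] += 1.0
        rows.append(row)
    return rows, vocab

def top_right_singular_vectors(A, k, iters=1000):
    """Approximate the k leading right singular vectors of A.

    Assumption of this sketch: power iteration on A^T A, keeping each new
    vector orthogonal to those already found. A real analysis would call
    a tested SVD routine instead.
    """
    n_rows, n_cols = len(A), len(A[0])
    basis = []
    for j in range(k):
        v = [1.0 / (i + j + 1) for i in range(n_cols)]  # deterministic start
        for _ in range(iters):
            Av = [dot(row, v) for row in A]
            w = [sum(A[r][c] * Av[r] for r in range(n_rows))
                 for c in range(n_cols)]
            for b in basis:  # remove components along earlier vectors
                p = dot(w, b)
                w = [x - p * y for x, y in zip(w, b)]
            size = norm(w)
            if size == 0.0:
                break
            v = [x / size for x in w]
        basis.append(v)
    return basis

def lsa_document_vectors(docs, k):
    """Project each document onto the k leading latent dimensions."""
    A, _ = doc_term_matrix(docs)
    basis = top_right_singular_vectors(A, k)
    return [[dot(row, v) for v in basis] for row in A]

# Hypothetical toy corpus: two texts about text similarity, two about search.
docs = [
    "stata command text similarity",
    "latent semantic analysis text similarity",
    "query expansion search engine",
    "search engine query users",
]
vectors = lsa_document_vectors(docs, k=2)
sim_same_topic = cosine(vectors[0], vectors[1])
sim_cross_topic = cosine(vectors[0], vectors[2])
```

In this toy corpus the two topic groups share no words, so with two latent dimensions the texts within a group end up nearly parallel and texts from different groups nearly orthogonal. Real corpora overlap far more, and the number of retained dimensions becomes a tuning choice.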
Journal
Journal title: The Stata Journal: Promoting communications on statistics and Stata
Year: 2019
ISSN: 1536-867X,1536-8734
DOI: 10.1177/1536867x19830910